Results 1 - 20 of 30
1.
Article in English | MEDLINE | ID: mdl-38536693

ABSTRACT

This paper studies how to flexibly integrate reconstructed 3D models into practical 3D modeling pipelines such as 3D scene creation and rendering. Due to technical difficulty, one can only obtain rough 3D models (R3DMs) for most real objects using existing 3D reconstruction techniques. As a result, physically-based rendering (PBR) would render low-quality images or videos for scenes constructed from R3DMs. One promising solution is to represent real-world objects as Neural Fields such as NeRFs, which are able to generate photo-realistic renderings of an object under desired viewpoints. However, a drawback is that the synthesized views through Neural Fields Rendering (NFR) cannot reflect the simulated lighting details on R3DMs in PBR pipelines, especially when object interactions in 3D scene creation cause local shadows. To solve this dilemma, we propose a lighting transfer network (LighTNet) to bridge NFR and PBR, such that they can benefit from each other. LighTNet reasons about a simplified image composition model, remedies the uneven-surface issue caused by R3DMs, and is empowered by several perceptually motivated constraints and a new Lab angle loss, which enhances the contrast between lighting strength and colors. Comparisons demonstrate that LighTNet is superior in synthesizing impressive lighting and is promising for pushing NFR further in practical 3D modeling workflows.
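The abstract does not spell out the Lab angle loss; a minimal sketch of an angle-based loss between per-pixel (L, a, b) vectors, with the function name and exact form being assumptions rather than the paper's definition, could look like:

```python
import numpy as np

def lab_angle_loss(pred_lab, target_lab, eps=1e-8):
    """Mean angle (radians) between per-pixel (L, a, b) vectors.

    pred_lab, target_lab: arrays of shape (N, 3) of Lab pixel values.
    Hypothetical reconstruction; the paper's actual loss may differ.
    """
    dot = np.sum(pred_lab * target_lab, axis=1)
    norms = np.linalg.norm(pred_lab, axis=1) * np.linalg.norm(target_lab, axis=1)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)  # guard against rounding
    return float(np.mean(np.arccos(cos)))
```

An angular loss of this kind is insensitive to uniform scaling of the Lab vector, which is one plausible way to separate lighting strength from color, as the abstract suggests.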

2.
Nano Lett ; 24(9): 2789-2797, 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38407030

ABSTRACT

Two-dimensional materials are expected to play an important role in next-generation electronic and optoelectronic devices. Recently, twisted bilayer graphene and transition metal dichalcogenides have attracted significant attention due to their unique physical properties and potential applications. In this study, we describe the use of optical microscopy to collect the color space of chemical vapor deposition (CVD)-grown molybdenum disulfide (MoS2) and the application of a semantic segmentation convolutional neural network (CNN) to accurately and rapidly identify the thicknesses of MoS2 flakes. A second CNN model is trained to provide precise predictions of the twist angle of CVD-grown bilayer flakes. This model harnessed a data set comprising over 10,000 synthetic images, encompassing geometries spanning from hexagonal to triangular shapes. Subsequent validation of the deep learning predictions on twist angles was executed through second harmonic generation and Raman spectroscopy. Our results introduce a scalable methodology for automated inspection of twisted atomically thin CVD-grown bilayers.

3.
IEEE Trans Neural Netw Learn Syst ; 34(12): 10473-10486, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35771784

ABSTRACT

In singular models, the optimal set of parameters forms an analytic set with singularities, and classical statistical inference cannot be applied to such models. This is significant for deep learning, as neural networks are singular, and thus "dividing" by the determinant of the Hessian or employing the Laplace approximation is not appropriate. Despite its potential for addressing fundamental issues in deep learning, singular learning theory appears to have made few inroads into the developing canon of deep learning theory. Via a mix of theory and experiment, we present an invitation to singular learning theory as a vehicle for understanding deep learning, and suggest important future work to make singular learning theory directly applicable to how deep learning is performed in practice.

4.
IEEE Trans Pattern Anal Mach Intell ; 45(3): 3904-3917, 2023 Mar.
Article in English | MEDLINE | ID: mdl-35759594

ABSTRACT

Video summarization aims to automatically generate a summary (storyboard or video skim) of a video, which can facilitate large-scale video retrieval and browsing. Most of the existing methods perform video summarization on individual videos, which neglects the correlations among similar videos. Such correlations, however, are also informative for video understanding and video summarization. To address this limitation, we propose Video Joint Modelling based on Hierarchical Transformer (VJMHT) for co-summarization, which takes into consideration the semantic dependencies across videos. Specifically, VJMHT consists of two layers of Transformer: the first layer extracts semantic representation from individual shots of similar videos, while the second layer performs shot-level video joint modelling to aggregate cross-video semantic information. By this means, complete cross-video high-level patterns are explicitly modelled and learned for the summarization of individual videos. Moreover, Transformer-based video representation reconstruction is introduced to maximize the high-level similarity between the summary and the original video. Extensive experiments are conducted to verify the effectiveness of the proposed modules and the superiority of VJMHT in terms of F-measure and rank-based evaluation.

5.
IEEE J Biomed Health Inform ; 26(8): 3966-3975, 2022 08.
Article in English | MEDLINE | ID: mdl-35522642

ABSTRACT

Generative Adversarial Networks (GANs) have many potential medical imaging applications, including data augmentation, domain adaptation, and model explanation. Due to the limited memory of Graphical Processing Units (GPUs), most current 3D GAN models are trained on low-resolution medical images; these models either cannot scale to high resolution or are prone to patchy artifacts. In this work, we propose a novel end-to-end GAN architecture that can generate high-resolution 3D images. We achieve this goal by using different configurations between training and inference. During training, we adopt a hierarchical structure that simultaneously generates a low-resolution version of the image and a randomly selected sub-volume of the high-resolution image. The hierarchical design has two advantages: First, the memory demand for training on high-resolution images is amortized among sub-volumes. Furthermore, anchoring the high-resolution sub-volumes to a single low-resolution image ensures anatomical consistency between sub-volumes. During inference, our model can directly generate full high-resolution images. We also incorporate an encoder with a similar hierarchical structure into the model to extract features from the images. Experiments on 3D thorax CT and brain MRI demonstrate that our approach outperforms the state of the art in image generation. We also demonstrate clinical applications of the proposed model in data augmentation and clinically relevant feature extraction.
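The memory-amortization idea above can be illustrated with the data preparation step alone: each training iteration sees the whole volume only at low resolution plus one random high-resolution sub-volume. The function name, scale factor, and sub-volume size below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def training_views(volume, scale=4, sub_size=32, rng=None):
    """Return (low_res, sub_volume, corner): a downsampled copy of the whole
    volume plus one randomly chosen high-res sub-volume, mirroring the
    memory-amortized hierarchical training scheme (sketch only).
    """
    rng = rng or np.random.default_rng(0)
    low_res = volume[::scale, ::scale, ::scale]   # cheap strided downsample
    d, h, w = volume.shape
    z, y, x = (int(rng.integers(0, s - sub_size + 1)) for s in (d, h, w))
    sub = volume[z:z + sub_size, y:y + sub_size, x:x + sub_size]
    return low_res, sub, (z, y, x)
```

The returned corner coordinates are what would let the generator condition the sub-volume on the matching region of the low-resolution image, keeping sub-volumes anatomically consistent.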


Subjects
Image Processing, Computer-Assisted , Imaging, Three-Dimensional , Artifacts , Humans , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Tomography, X-Ray Computed
6.
J Phys Chem Lett ; 13(17): 3831-3839, 2022 May 05.
Article in English | MEDLINE | ID: mdl-35467342

ABSTRACT

The deformation and fracture mechanisms of two-dimensional (2D) materials are still unclear and not thoroughly investigated. Given this, mechanical properties and mechanisms are explored using the example of gallium telluride (GaTe), a promising 2D semiconductor with ultrahigh photoresponsivity and high flexibility. Here, the mechanical properties of both substrate-supported and suspended GaTe multilayers were investigated through Berkovich-tip nanoindentation instead of the commonly used AFM-based nanoindentation method. An unusual concurrence of multiple pop-in and load-drop events in the loading curve was observed. Theoretical calculations revealed that this concurrence originates from the interlayer-sliding-mediated layer-by-layer fracture mechanism in GaTe multilayers. The van der Waals interlayer interactions between GaTe and the substrates were revealed to be much stronger than those between GaTe interlayers, resulting in easy sliding and fracture of multilayers within GaTe. This work introduces new insights into the deformation and fracture of GaTe and other 2D materials for flexible electronics applications.

7.
IEEE Trans Image Process ; 30: 5264-5276, 2021.
Article in English | MEDLINE | ID: mdl-34033540

ABSTRACT

Depth completion aims to recover a dense depth map from sparse depth data and the corresponding single RGB image. The observed pixels provide significant guidance for recovering the depth of the unobserved pixels. However, due to the sparsity of the depth data, the standard convolution operation, exploited by most existing methods, is not effective at modeling the observed contexts with depth values. To address this issue, we propose to adopt graph propagation to capture the observed spatial contexts. Specifically, we first construct multiple graphs at different scales from the observed pixels. Since the graph structure varies from sample to sample, we then apply an attention mechanism to the propagation, which encourages the network to model the contextual information adaptively. Furthermore, considering the multi-modality of the input data, we exploit graph propagation on the two modalities respectively to extract multi-modal representations. Finally, we introduce a symmetric gated fusion strategy to exploit the extracted multi-modal features effectively. The proposed strategy preserves the original information for one modality and also absorbs complementary information from the other by learning adaptive gating weights. Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves state-of-the-art performance on two benchmarks, i.e., KITTI and NYU-v2, while having fewer parameters than the latest models. Our code is available at: https://github.com/sshan-zhao/ACMNet.
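The core operation described above, attention-weighted propagation over a graph of observed pixels, can be sketched as a single aggregation step. The dot-product attention and the function name are generic assumptions, not ACMNet's exact formulation.

```python
import numpy as np

def attention_propagate(feats, neighbors):
    """One step of attention-weighted graph propagation: each node aggregates
    its neighbors' features with softmax weights derived from dot-product
    similarity. feats: (N, C); neighbors: list of index arrays. Sketch only.
    """
    out = np.empty_like(feats)
    for i, nbr in enumerate(neighbors):
        scores = feats[nbr] @ feats[i]          # similarity to node i, shape (k,)
        w = np.exp(scores - scores.max())       # numerically stable softmax
        w /= w.sum()
        out[i] = w @ feats[nbr]                 # attention-weighted aggregation
    return out
```

Because the weights are recomputed per node from the features, the aggregation adapts to each sample's graph structure, which is the property the abstract emphasizes.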

8.
IEEE Trans Image Process ; 30: 3056-3068, 2021.
Article in English | MEDLINE | ID: mdl-33556007

ABSTRACT

Tracking moving objects in space-borne satellite videos is a new and challenging task. The main difficulty stems from the extremely small size of the target of interest. First, because the target usually occupies only a few pixels, it is hard to obtain discriminative appearance features. Second, the small object can easily suffer from occlusion and illumination variation, making its features less distinguishable from those of surrounding regions. Current state-of-the-art tracking approaches mainly consider high-level deep features of a single frame with low spatial resolution, and hardly benefit from the inter-frame motion information inherent in videos. Thus, they fail to accurately locate such small objects and to handle challenging scenarios in satellite videos. In this article, we design a lightweight parallel network with high spatial resolution to locate small objects in satellite videos. This architecture guarantees real-time and precise localization when applied to Siamese trackers. Moreover, a pixel-level refining model based on online moving object detection and adaptive fusion is proposed to enhance tracking robustness in satellite videos. It models the video sequence over time to detect moving targets at the pixel level and is able to take full advantage of both tracking and detection. We conduct quantitative experiments on real satellite video datasets, and the results show that the proposed High-Resolution Siamese Network (HRSiam) achieves state-of-the-art tracking performance while running at over 30 FPS.

9.
Bioinformatics ; 37(6): 785-792, 2021 05 05.
Article in English | MEDLINE | ID: mdl-33070196

ABSTRACT

MOTIVATION: There is growing interest in the biomedical research community in incorporating retrospective data, available in healthcare systems, to shed light on associations between different biomarkers. Understanding the association between various types of biomedical data, such as genetics, blood biomarkers, and imaging, can provide a holistic understanding of human diseases. To formally test a hypothesized association between two types of data in Electronic Health Records (EHRs), one requires a substantial sample size with both data modalities to achieve reasonable power. Current association test methods only allow using data from individuals who have both data modalities. Hence, researchers cannot take advantage of much larger EHR samples that include individuals with at least one of the data types, which limits the power of the association test. RESULTS: We present a new method called the Semi-paired Association Test (SAT) that makes use of both paired and unpaired data. In contrast to classical approaches, incorporating unpaired data allows SAT to produce better control of false discovery and to improve the power of the association test. We study the properties of the new test theoretically and empirically, through a series of simulations and by applying our method to real studies in the context of Chronic Obstructive Pulmonary Disease. We are able to identify an association between the high-dimensional characterization of Computed Tomography chest images and several blood biomarkers as well as the expression of dozens of genes involved in the immune system. AVAILABILITY AND IMPLEMENTATION: Code is available at https://github.com/batmanlab/Semi-paired-Association-Test. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
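The semi-paired idea can be made concrete with a toy estimator: marginal quantities (means, variances) are estimated from all samples, paired and unpaired alike, while the cross-moment necessarily uses only the paired subset. This is an illustration of the principle, not the SAT statistic itself.

```python
import numpy as np

def semi_paired_correlation(x_paired, y_paired, x_extra, y_extra):
    """Correlation estimate in which means and standard deviations use ALL
    samples (paired plus unpaired), while only the cross-moment uses the
    paired subset. Toy illustration of the semi-paired idea.
    """
    mx, sx = np.mean(np.r_[x_paired, x_extra]), np.std(np.r_[x_paired, x_extra])
    my, sy = np.mean(np.r_[y_paired, y_extra]), np.std(np.r_[y_paired, y_extra])
    cov = np.mean((x_paired - mx) * (y_paired - my))  # paired data only
    return cov / (sx * sy)
```

With large unpaired pools, the marginal estimates stabilize, which is the mechanism by which unpaired data can sharpen an association test.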


Subjects
Electronic Health Records , Research Design , Humans , Retrospective Studies , Sample Size
10.
Med Phys ; 48(3): 1168-1181, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33340116

ABSTRACT

PURPOSE: To develop and evaluate a deep learning (DL) approach to extract rich information from high-resolution computed tomography (HRCT) of patients with chronic obstructive pulmonary disease (COPD). METHODS: We develop a DL-based model to learn a compact representation of a subject that is predictive of COPD physiologic severity and other outcomes. Our DL model learned: (a) to extract informative regional image features from HRCT; (b) to adaptively weight these features and form an aggregate patient representation; and finally, (c) to predict several COPD outcomes. The adaptive weights correspond to the regional lung contribution to the disease. We evaluate the model on 10,300 participants from the COPDGene cohort. RESULTS: Our model was strongly predictive of spirometric obstruction (r² = 0.67) and grouped 65.4% of subjects correctly, and 89.1% within one stage, of their GOLD severity stage. Our model achieved an accuracy of 41.7% and 52.8% in stratifying the population based on centrilobular (5-grade) and paraseptal (3-grade) emphysema severity scores, respectively. For predicting future exacerbation, combining subjects' representations from our model with their past exacerbation histories achieved an accuracy of 80.8% (area under the ROC curve of 0.73). For all-cause mortality, in Cox regression analysis, we outperformed the BODE index, improving the concordance metric (ours: 0.61 vs BODE: 0.56). CONCLUSIONS: Our model independently predicted spirometric obstruction, emphysema severity, exacerbation risk, and mortality from CT imaging alone. This method has potential applicability in both research and clinical practice.
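Step (b) above, adaptively weighting regional features into one patient representation, is essentially attention pooling. The scoring vector and function name below are hypothetical; in the actual model the scores would be learned.

```python
import numpy as np

def aggregate_patient(region_feats, w_vec):
    """Softmax-attention pooling of regional image features into a single
    patient representation; the weights indicate each region's contribution
    (sketch). region_feats: (R, C); w_vec: (C,) hypothetical scoring vector.
    """
    scores = region_feats @ w_vec
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # weights sum to 1 across regions
    return weights @ region_feats, weights
```

Because the weights form a distribution over regions, they double as an interpretability signal, matching the abstract's claim that the adaptive weights correspond to regional lung contribution to disease.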


Subjects
Deep Learning , Pulmonary Disease, Chronic Obstructive , Pulmonary Emphysema , Humans , Predictive Value of Tests , Pulmonary Disease, Chronic Obstructive/diagnostic imaging , Severity of Illness Index , Tomography, X-Ray Computed
11.
Proc AAAI Conf Artif Intell ; 34(4): 6526-6533, 2020 Feb.
Article in English | MEDLINE | ID: mdl-32944410

ABSTRACT

The majority of state-of-the-art deep learning methods are discriminative approaches, which model the conditional distribution of labels given input features. The success of such approaches heavily depends on high-quality labeled instances, which are not easy to obtain, especially as the number of candidate classes increases. In this paper, we study the complementary learning problem. Unlike ordinary labels, complementary labels are easy to obtain because an annotator only needs to provide a yes/no answer to a randomly chosen candidate class for each instance. We propose a generative-discriminative complementary learning method that estimates the ordinary labels by modeling both the conditional (discriminative) and instance (generative) distributions. Our method, which we call Complementary Conditional GAN (CCGAN), improves the accuracy of predicting ordinary labels and is able to generate high-quality instances in spite of weak supervision. In addition to extensive empirical studies, we theoretically show that our model can retrieve the true conditional distribution from the complementarily-labeled data.
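A complementary label names a class the instance does *not* belong to. One generic way to train from such labels, not CCGAN's specific objective, is to penalize probability mass on the ruled-out class:

```python
import numpy as np

def complementary_loss(probs, comp_label):
    """Loss for a complementary label: penalize the probability assigned to
    the class the annotator ruled out, via -log(1 - p_comp). Generic sketch,
    not the CCGAN objective itself.
    """
    return -np.log(1.0 - probs[comp_label] + 1e-12)  # eps avoids log(0)
```

The loss is zero when the model puts no mass on the ruled-out class and grows as mass accumulates there, which is exactly the yes/no signal the annotator provided.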

12.
Front Neurosci ; 14: 350, 2020.
Article in English | MEDLINE | ID: mdl-32410939

ABSTRACT

Accurate segmentation is an essential task when working with medical images. Recently, deep convolutional neural networks have achieved state-of-the-art performance on many segmentation benchmarks. Regardless of the network architecture, deep learning-based segmentation methods view the segmentation problem as a supervised task that requires a relatively large number of annotated images. Acquiring a large number of annotated medical images is time consuming, and high-quality segmented images (i.e., strong labels) crafted by human experts are expensive. In this paper, we propose a method that achieves competitive accuracy from a "weakly annotated" image, where the weak annotation is a 3D bounding box denoting an object of interest. Our method, called "3D-BoxSup," employs a positive-unlabeled learning framework to learn segmentation masks from 3D bounding boxes. Specifically, we consider the pixels outside of the bounding box as positively labeled data and the pixels inside the bounding box as unlabeled data. Our method can suppress the negative effects of pixels residing between the true segmentation mask and the 3D bounding box and produce accurate segmentation masks. We applied our method to segment brain tumors. The experimental results on the BraTS 2017 dataset (Menze et al., 2015; Bakas et al., 2017a,b,c) demonstrate the effectiveness of our method.
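The labeling scheme described above is simple to state in code: everything outside the 3D bounding box is positively labeled (known background), everything inside is unlabeled. The helper below is an illustrative sketch of that mask construction, not the 3D-BoxSup training code.

```python
import numpy as np

def pu_labels(shape, box):
    """Build the positive/unlabeled voxel mask of the 3D-BoxSup setup:
    voxels outside the 3D bounding box are positively labeled (1),
    voxels inside are unlabeled (0). box = (z0, z1, y0, y1, x0, x1).
    """
    labels = np.ones(shape, dtype=np.int8)   # outside the box: positive
    z0, z1, y0, y1, x0, x1 = box
    labels[z0:z1, y0:y1, x0:x1] = 0          # inside the box: unlabeled
    return labels
```

A positive-unlabeled classifier trained on this mask must then decide, for the unlabeled voxels only, which belong to the object and which are box slack around it.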

13.
IEEE Trans Neural Netw Learn Syst ; 31(11): 4673-4687, 2020 Nov.
Article in English | MEDLINE | ID: mdl-31940557

ABSTRACT

Domain adaptation has proven to be successful in dealing with the case where training and test samples are drawn from two different distributions. Recently, second-order statistics alignment has gained significant attention in the field of domain adaptation due to its simplicity and effectiveness. However, researchers have encountered major difficulties with optimization, as it is difficult to find an explicit expression for the gradient. Moreover, the transformation employed here does not perform dimensionality reduction. Accordingly, in this article, we prove that there exists a scaled LogDet metric that is more effective for second-order statistics alignment than the Frobenius norm, and hence we consider it for second-order statistics alignment. First, we introduce two homologous transformations, which help to reduce dimensionality and excavate transferable knowledge from the relevant domain. Second, we provide an explicit gradient expression, which is an important ingredient for optimization. We further extend the LogDet model from the single-source domain setting to the multisource domain setting by applying the weighted Karcher mean to the LogDet metric. Experiments on both synthetic and realistic domain adaptation tasks demonstrate that the proposed approaches are effective when compared with state-of-the-art ones.
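For reference, the standard (unscaled) LogDet divergence between two symmetric positive-definite matrices A and B is D(A, B) = tr(AB⁻¹) − log det(AB⁻¹) − n. The paper's scaled variant may differ; this sketch shows only the standard form applied to, e.g., domain covariance matrices:

```python
import numpy as np

def logdet_div(A, B):
    """Standard LogDet (Stein) divergence between SPD matrices:
    D(A, B) = tr(A B^{-1}) - log det(A B^{-1}) - n.
    The paper's scaled variant may differ from this form.
    """
    n = A.shape[0]
    M = A @ np.linalg.inv(B)
    _, logdet = np.linalg.slogdet(M)         # stable log-determinant
    return float(np.trace(M) - logdet - n)
```

The divergence is zero exactly when A = B and positive otherwise, making it a natural objective for aligning source and target covariances.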

14.
Proc Mach Learn Res ; 119: 10913-10924, 2020 Jul.
Article in English | MEDLINE | ID: mdl-33855300

ABSTRACT

Domain adaptation aims to correct classifiers when faced with distribution shift between source (training) and target (test) domains. State-of-the-art domain adaptation methods make use of deep networks to extract domain-invariant representations. However, existing methods assume that all the instances in the source domain are correctly labeled, while in reality we may well obtain a source domain with noisy labels. In this paper, we are the first to comprehensively investigate how label noise can adversely affect existing domain adaptation methods in various scenarios. Further, we theoretically prove that there exists a method that can essentially reduce the side effects of noisy source labels in domain adaptation. Specifically, focusing on the generalized target shift scenario, where both the label distribution P(Y) and the class-conditional distribution P(X|Y) can change, we discover that the denoising Conditional Invariant Component (DCIC) framework provably ensures (1) extracting invariant representations given examples with noisy labels in the source domain and unlabeled examples in the target domain and (2) estimating the label distribution in the target domain with no bias. Experimental results on both synthetic and real-world data verify the effectiveness of the proposed method.

15.
Article in English | MEDLINE | ID: mdl-31562086

ABSTRACT

The performance of single image super-resolution (SISR) has been largely improved by innovative designs of deep architectures. An important claim raised by these designs is that deep models have a large receptive field size and strong nonlinearity. However, we are concerned with the question of which factor, receptive field size or model depth, is more critical for SISR. To reveal the answer, in this paper we propose a strategy based on dilated convolution to investigate how the two factors affect the performance of SISR. Our findings from exhaustive investigations suggest that SISR is more sensitive to changes in receptive field size than to model depth variations, and that the model depth must be congruent with the receptive field size to produce improved performance. These findings inspire us to design a shallower architecture which can save computational and memory cost while preserving comparable effectiveness with respect to a much deeper architecture.
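Dilated convolution is the natural probe here because it decouples the two factors: for a stack of stride-1 convolutions, the receptive field is 1 + Σ (kᵢ − 1)·dᵢ over layers, so dilation dᵢ grows the receptive field without adding depth. A small helper makes this concrete:

```python
def receptive_field(kernels, dilations):
    """Receptive field of a stack of stride-1 dilated convolutions:
    rf = 1 + sum((k - 1) * d) over layers.
    """
    return 1 + sum((k - 1) * d for k, d in zip(kernels, dilations))
```

Three 3x3 layers with dilations 1, 2, 4 reach a receptive field of 15, whereas seven undilated 3x3 layers would be needed for the same coverage, which is how the paper can vary receptive field and depth independently.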

16.
Proc Mach Learn Res ; 89: 3449-3458, 2019 Apr.
Article in English | MEDLINE | ID: mdl-31497776

ABSTRACT

Covariate shift is a prevalent setting for supervised learning in the wild, where the training and test data are drawn from different time periods, from different but related domains, or via different sampling strategies. This paper addresses a transfer learning setting with covariate shift between source and target domains. Most existing methods for correcting covariate shift exploit density ratios of the features to reweight the source-domain data, and when the features are high-dimensional, the estimated density ratios may suffer from large estimation variances, leading to poor prediction performance. In this work, we investigate how covariate shift correction performance depends on the dimensionality of the features, and propose a correction method that finds a low-dimensional representation of the features, which takes into account features relevant to the target Y, and exploits the density ratio of this representation for importance reweighting. We discuss the factors affecting the performance of our method and demonstrate its capabilities on both pseudo-real and real-world data.
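The reweighting step can be illustrated in one dimension, standing in for the learned low-dimensional representation: fit a density to the source and target projections and weight each source sample by the target-to-source density ratio. Gaussian densities and the function name here are simplifying assumptions.

```python
import numpy as np

def gaussian_importance_weights(z_src, z_tgt):
    """Importance weights w(z) = p_tgt(z) / p_src(z) on a 1-D representation
    z, with each density fit as a Gaussian - a deliberately low-dimensional
    stand-in for the paper's learned representation.
    """
    ms, ss = z_src.mean(), z_src.std() + 1e-8
    mt, st = z_tgt.mean(), z_tgt.std() + 1e-8
    def logpdf(z, m, s):                      # log N(z; m, s) up to a constant
        return -0.5 * ((z - m) / s) ** 2 - np.log(s)
    return np.exp(logpdf(z_src, mt, st) - logpdf(z_src, ms, ss))
```

Doing this on a one-dimensional projection rather than on the raw high-dimensional features is precisely what keeps the density-ratio estimate's variance manageable.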

17.
Proc Mach Learn Res ; 89: 3487-3496, 2019 Apr.
Article in English | MEDLINE | ID: mdl-31497777

ABSTRACT

A key problem in domain adaptation is determining what to transfer across different domains. We propose a data-driven method to represent these changes across multiple source domains and perform unsupervised domain adaptation. We assume that the joint distributions follow a specific generating process and have a small number of identifiable changing parameters, and develop a data-driven method to identify the changing parameters by learning low-dimensional representations of the changing class-conditional distributions across multiple source domains. The learned low-dimensional representations enable us to reconstruct the target-domain joint distribution from unlabeled target-domain data, and further enable predicting the labels in the target domain. We demonstrate the efficacy of this method by conducting experiments on synthetic and real datasets.

18.
Proc Mach Learn Res ; 97: 2901-2910, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31497778

ABSTRACT

In many scientific fields, such as economics and neuroscience, we are often faced with nonstationary time series and are concerned with both finding causal relations and forecasting the values of variables of interest, both of which are particularly challenging in such nonstationary environments. In this paper, we study causal discovery and forecasting for nonstationary time series. By exploiting a particular type of state-space model to represent the processes, we show that nonstationarity helps to identify the causal structure and that forecasting naturally benefits from learned causal knowledge. Specifically, we allow changes in both causal strengths and noise variances in the nonlinear state-space models, which, interestingly, renders both the causal structure and the model parameters identifiable. Given the causal model, we treat forecasting as a problem of Bayesian inference in the causal model, which exploits the time-varying property of the data and adapts to new observations in a principled manner. Experimental results on synthetic and real-world data sets demonstrate the efficacy of the proposed methods.

19.
Med Image Anal ; 58: 101534, 2019 12.
Article in English | MEDLINE | ID: mdl-31352179

ABSTRACT

Quasi-static ultrasound elastography is an important imaging technology for assessing the condition of various diseases by reconstructing tissue strain from radio-frequency data. State-of-the-art strain reconstruction techniques suffer from unfriendliness to inexperienced users, high model bias, and a low effectiveness-to-efficiency ratio. These three challenges result from the explicitness characteristic (i.e., explicit formulation of the reconstruction model) of these techniques. To address these challenges, we are the first to develop an implicit strain reconstruction framework based on a deep neural network architecture. However, classic neural network methods are unsuitable for the strain reconstruction task because it is difficult to impose any direct influence on the intermediate state of their learning process. This may cause the map learned by the network to be biased relative to the desired map. In order to correct the intermediate state of the learning process, our framework adopts the learning-using-privileged-information (LUPI) paradigm with causality in the network. It provides causal privileged information alongside the training examples to help the network learn, while making this privileged information unavailable at the test stage. This improvement narrows the search region of the map learned by the network and thus prompts the network to evolve towards the actual ultrasound elastography process. Moreover, in order to ensure causality in LUPI, our framework adopts a physically based data generation strategy to produce triplets of privileged information, training examples, and labels. This data generation process approximately describes the actual ultrasound elastography process through numerical simulation based on tissue biomechanics and ultrasound physics. It thus builds the causal relationship between the privileged information and the training examples/labels, and also addresses the problem of insufficient medical data.
The performance of our framework has been validated on 100 simulation datasets, 42 phantom datasets, and 4 real clinical datasets by comparison with the ground truth produced by an ultrasound simulation system and with four state-of-the-art methods. The experimental results show that our framework agrees well with the ground truth (average bias of 0.065 for strain reconstruction) and is superior to these state-of-the-art methods. These results demonstrate the effectiveness of our framework for strain reconstruction.


Subjects
Elasticity Imaging Techniques/methods , Image Interpretation, Computer-Assisted/methods , Neural Networks, Computer , Data Compression , Humans
20.
Article in English | MEDLINE | ID: mdl-32076365

ABSTRACT

Unsupervised domain mapping aims to learn a function G_XY that translates domain X to domain Y in the absence of paired examples. Finding the optimal G_XY without paired data is an ill-posed problem, so appropriate constraints are required to obtain reasonable solutions. While some prominent constraints such as cycle consistency and distance preservation successfully constrain the solution space, they overlook a special property of images: simple geometric transformations do not change an image's semantic structure. Based on this property, we develop a geometry-consistent generative adversarial network (GcGAN), which enables one-sided unsupervised domain mapping. GcGAN takes the original image and its counterpart transformed by a predefined geometric transformation as inputs and generates two images in the new domain, coupled with the corresponding geometry-consistency constraint. The geometry-consistency constraint reduces the space of possible solutions while keeping the correct solutions in the search space. Quantitative and qualitative comparisons with the baseline (GAN alone) and state-of-the-art methods including CycleGAN [66] and DistanceGAN [5] demonstrate the effectiveness of our method.
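The geometry-consistency constraint asks that translating a transformed image give the same result as transforming the translated image: G(T(x)) ≈ T(G(x)). A toy check with T = 90-degree rotation and a stand-in pixel-wise "translator" (which trivially commutes with rotation) shows the residual GcGAN penalizes:

```python
import numpy as np

def geometry_consistency_gap(G, x, T=np.rot90):
    """Geometry-consistency residual max|G(T(x)) - T(G(x))|, the quantity a
    GcGAN-style loss drives to zero (T is a predefined transform such as a
    90-degree rotation). G here is any image-to-image function; the toy
    translator below is a stand-in, not a trained generator.
    """
    return float(np.abs(G(T(x)) - T(G(x))).max())

# A pixel-wise translator commutes with rotation, so its gap is zero:
x = np.arange(16.0).reshape(4, 4)
gap = geometry_consistency_gap(lambda img: 1.0 - img, x)
```

A real generator with spatial context does not commute automatically, which is why the constraint is informative: it rules out translators whose output structure is untethered from the input's geometry.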
